Speech analysis and coding using a multi-resolution sinusoidal transform
Abstract
The sinusoidal transform (ST), as developed by Quatieri and McAulay, provides a sparse representation for speech signals by taking advantage of psychoacoustic masking. The work reported here takes the sinusoidal transform one step further by considering the frequency resolution abilities of the human auditory system in more detail. The new transform is based on the wavelet principle of variable resolution in time/frequency analysis. Specifically, a sinusoidal transform is developed which uses quadrature mirror filter (QMF) banks to obtain better time resolution at high frequencies and better frequency resolution at low frequencies. This naturally provides a perceptually improved allocation of the sinusoids. The new transform matches the human auditory system better than its predecessor, and it also matches speech signals well, both fricative sounds and voiced speech. The QMF-based ST is then shown to be equivalent to a more efficient FFT-based implementation.
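The variable-resolution idea behind the QMF bank can be illustrated with a minimal sketch. It is not the filter bank used in the paper: it assumes the two-tap Haar pair, the simplest QMF pair, as a stand-in, and the function names (`qmf_split`, `qmf_merge`, `octave_analysis`, `octave_synthesis`) are hypothetical. The sketch only shows the octave-band splitting and its reconstruction; the sinusoidal analysis that would run inside each band is omitted.

```python
import math

_S = math.sqrt(0.5)  # normalisation that makes the Haar pair orthonormal


def qmf_split(x):
    """Split an even-length sequence into half-rate low and high bands.

    Uses the 2-tap Haar QMF pair (an assumption for illustration only).
    """
    half = len(x) // 2
    low = [_S * (x[2 * i] + x[2 * i + 1]) for i in range(half)]
    high = [_S * (x[2 * i] - x[2 * i + 1]) for i in range(half)]
    return low, high


def qmf_merge(low, high):
    """Invert qmf_split: rebuild the full-rate sequence from the two bands."""
    x = []
    for l, h in zip(low, high):
        x.append(_S * (l + h))
        x.append(_S * (l - h))
    return x


def octave_analysis(x, levels):
    """Octave-band tree: split repeatedly, recursing on the low band only.

    Returns [high_1, high_2, ..., high_levels, low_levels].
    len(x) must be divisible by 2**levels.
    """
    bands = []
    low = list(x)
    for _ in range(levels):
        low, high = qmf_split(low)
        bands.append(high)
    bands.append(low)
    return bands


def octave_synthesis(bands):
    """Rebuild the signal from the output of octave_analysis."""
    low = bands[-1]
    for high in reversed(bands[:-1]):
        low = qmf_merge(low, high)
    return low
```

For a 64-sample input and three levels, the bands hold 32, 16, 8, and 8 samples. The highest band keeps the most samples per unit time (finer time resolution), while each further split of the low band halves its bandwidth (finer frequency resolution), which is the trade-off the abstract describes.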
Related papers
A new sinusoidal phase modeling algorithm
Sassan Ahmadi and Andreas S. Spanias, Department of Electrical Engineering, Telecommunications Research Center, Arizona State University, Tempe, AZ 85287-7206, USA
A new phase modeling algorithm for sinusoidal analysis and synthesis of speech signals is presented. Short-time sinusoidal phases are efficiently approximated by incorporating linear prediction, spectral sampling, delay compensatio...
Multi-rate speech coding for wireless and Internet applications
Fixed-rate speech codecs are unable to provide synthesized speech with fixed delay when the channel capacity changes, and cannot dedicate additional forward error correction bits for protection against noisy channels. We propose a multi-rate method for variable bandwidth applications, such as the Internet, and severely degraded wireless channels, such as mobile cellular. The technique uses a m...
A mixed sinusoidally excited linear prediction coder at 4 kb/s and below
There is currently a great deal of interest in the development of speech coding algorithms capable of delivering toll quality at 4 kb/s and below. For synthesizing high quality speech, accurate representation of the voiced portions of speech is essential. For bit rates of 4 kb/s and below, conventional Code Excited Linear Prediction (CELP) may not provide the appropriate degree of period...
High quality MELP coding at bit-rates around 4 kb/s
Recently, a number of coding techniques have been reported to achieve near toll quality synthesized speech at bit-rates around 4 kb/s. These include variants of Code Excited Linear Prediction (CELP), Sinusoidal Transform Coding (STC) and Multi-Band Excitation (MBE). While CELP has been an effective technique for bit-rates above 6 kb/s, STC, MBE, Waveform Interpolation (WI) and Mixed Excitation ...
A hybrid sub-band sinusoidal coding scheme
This paper describes a hybrid sub-band speech coding scheme based on sinusoidal coding and CELP. Purely voiced speech is encoded using sinusoidal coding techniques and phase information is selectively transmitted. For mixed and unvoiced speech, the lower band is processed by sinusoidal coding algorithms while the upper band is encoded using CELP. To accommodate the extra bandwidth required by t...
Publication date: 1996